Papillon Lexical Database Project: Monolingual Dictionaries & Interlingual Links

Authors

  • Gilles Sérasset
  • Mathieu Mangeot
Abstract

This paper presents Papillon, a new research and development project. It started as a French-Japanese cooperation between the laboratories GETA/CLIPS (Grenoble, France) and NII (Tokyo, Japan). Its goal is to build a multilingual lexical database and to extract digital bilingual dictionaries from it. The database consists of monolingual dictionaries, one for each language of the database, linked to an interlingual dictionary. The pivot architecture of the database is based on Gilles Sérasset’s Ph.D. thesis. The structure of the monolingual dictionaries is based on the lexical work of Igor Mel’čuk and Alain Polguère. The plan is to derive user-customized bilingual dictionaries in multiple target formats from the lexical database, covering both dictionaries for human use and specialized dictionaries for machine translation software. These dictionaries will be available under the terms of an open source license. The project, initiated by computational linguists, aims to be useful and open to everyone interested in Japanese and French. It is also open to any other language. Moreover, the pivot architecture of the database will make it easier to add new languages and will save translation effort.
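As a rough illustration of the pivot idea described in the abstract, the sketch below links word senses from several monolingual dictionaries to shared interlingual links and derives a bilingual dictionary for any language pair from them. The class name `Lexie`, the function `derive_bilingual`, the identifier scheme, and the sample entries are all assumptions made for this example; they are not the project's actual data model or file format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexie:
    """A word sense in one monolingual dictionary (hypothetical minimal form)."""
    lang: str
    headword: str
    sense_id: str


def derive_bilingual(lexies, axies, source, target):
    """Derive a source-to-target dictionary by passing through the pivot.

    `axies` maps an interlingual link id to the set of sense ids it connects.
    """
    by_id = {lx.sense_id: lx for lx in lexies}
    result = defaultdict(set)
    for linked_ids in axies.values():
        members = [by_id[i] for i in linked_ids if i in by_id]
        sources = [lx for lx in members if lx.lang == source]
        targets = [lx for lx in members if lx.lang == target]
        # Every source sense on this link translates to every target sense on it.
        for s in sources:
            for t in targets:
                result[s.headword].add(t.headword)
    return {head: sorted(words) for head, words in result.items()}


# Illustrative sample data only (not taken from the Papillon database).
lexies = [
    Lexie("fra", "papillon", "fra.papillon.1"),
    Lexie("jpn", "蝶", "jpn.chou.1"),
    Lexie("eng", "butterfly", "eng.butterfly.1"),
    Lexie("fra", "rivière", "fra.riviere.1"),
    Lexie("eng", "river", "eng.river.1"),
]
axies = {
    "axi.001": {"fra.papillon.1", "jpn.chou.1", "eng.butterfly.1"},
    "axi.002": {"fra.riviere.1", "eng.river.1"},
}

fra_jpn = derive_bilingual(lexies, axies, "fra", "jpn")
jpn_eng = derive_bilingual(lexies, axies, "jpn", "eng")
fra_eng = derive_bilingual(lexies, axies, "fra", "eng")
```

In this sketch each sense is linked once to the pivot, so adding the English entries required no direct French-English or Japanese-English translation work; those pairs fall out of the shared links. This is the effect the abstract points to when it says the pivot architecture eases the addition of new languages.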


Related resources

The translation of examples, citations, definitions and glosses in the Papillon project

The Papillon lexical database comprises a set of detailed monolingual dictionaries of « lexies » (word senses) interlinked through « axies » (interlingual links), which can also refer to external semantics-oriented systems such as UNL « universal words », WordNet « synsets », Ontos « concepts » or NTT ALT/JE system « semantic classes ». The basic idea is that bilingual or multitarget usage dict...


Automatically Populating Acception Lexical Database through Bilingual Dictionaries and Conceptual Vectors

The NLP team of LIRMM currently works on thematic and lexical disambiguation text analysis [Lafourcade, 2001]. We built a system, with automated learning capabilities, based on conceptual vectors for meaning representation. Vectors are supposed to encode ideas associated with words or expressions. In the framework of Acception Based Lexical Database (instantiated through the Papillon project), we...


A "Pivot" XML-Based Architecture for Multilingual, Multiversion Documents: Parallel Monolingual Documents Aligned Through a Central Correspondence Descriptor and Possible Use of UNL

We propose a structure for multilingual, multiversion documents, built on the model of the web-oriented, cooperative multilingual lexical database PAPILLON: a document is represented by a collection of monolingual XML "volumes" interlinked by a central volume of "interlingual links". Here, the links relate subdocuments (XML trees) that correspond to each other across the monolingual "volumes". We are de...


Hardening of Acception Links Through Vectorized Lexical Functions

In the framework of the Papillon project, we have defined strategies for populating a pivot dictionary of interlingual links from monolingual vectorial bases. Since there are quite a number of acceptions per entry, proper identification may be troublesome, and some added clues besides acception links may be useful. We improve the integrity of the acception base through well-known seman...


The PAPILLON Project: Cooperatively Building A Multilingual Lexical Data-Base To Derive Open Source Dictionaries And Lexicons

The PAPILLON project aims at creating a cooperative, free, permanent, web-oriented and personalizable environment for the development and the consultation of a multilingual lexical database. The initial motivation is the lack of dictionaries, both for humans and machines, between French and many Asian languages. In particular, although there are large F-J paper usage dictionaries, they are usab...




Publication date: 2001